53 research outputs found

    Robust Active Distillation

    Distilling knowledge from a large teacher model to a lightweight one is a widely successful approach for generating compact, powerful models in the semi-supervised learning setting where a limited amount of labeled data is available. In large-scale applications, however, the teacher tends to provide a large number of incorrect soft-labels that impair student performance. The sheer size of the teacher additionally constrains the number of soft-labels that can be queried due to prohibitive computational and/or financial costs. The difficulty in achieving simultaneous efficiency (i.e., minimizing soft-label queries) and robustness (i.e., avoiding student inaccuracies due to incorrect labels) hurts the widespread application of knowledge distillation to many modern tasks. In this paper, we present a parameter-free approach with provable guarantees to query the soft-labels of points that are simultaneously informative and correctly labeled by the teacher. At the core of our work lies a game-theoretic formulation that explicitly considers the inherent trade-off between the informativeness and correctness of input instances. We establish bounds on the expected performance of our approach that hold even in worst-case distillation instances. We present empirical evaluations on popular benchmarks that demonstrate the improved distillation performance enabled by our work relative to that of state-of-the-art active learning and active distillation methods.

    Learning-Augmented Weighted Paging

    We consider a natural semi-online model for weighted paging, where at any time the algorithm is given predictions, possibly with errors, about the next arrival of each page. The model is inspired by Belady's classic optimal offline algorithm for unweighted paging, and extends the recently studied model for learning-augmented paging (Lykouris and Vassilvitskii, 2018) to the weighted setting. For the case of perfect predictions, we provide an $\ell$-competitive deterministic and an $O(\log \ell)$-competitive randomized algorithm, where $\ell$ is the number of distinct weight classes. Both these bounds are tight, and imply an $O(\log W)$- and $O(\log \log W)$-competitive ratio, respectively, when the page weights lie between $1$ and $W$. Previously, it was not known how to use these predictions in the weighted setting and only bounds of $k$ and $O(\log k)$ were known, where $k$ is the cache size. Our results also generalize to the interleaved paging setting and to the case of imperfect predictions, with the competitive ratios degrading smoothly from $O(\ell)$ and $O(\log \ell)$ to $O(k)$ and $O(\log k)$, respectively, as the prediction error increases. Our results are based on several insights on structural properties of Belady's algorithm and the sequence of page arrival predictions, and novel potential functions that incorporate these predictions. For the case of unweighted paging, the results imply a very simple potential function based proof of the optimality of Belady's algorithm, which may be of independent interest.
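
For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of Belady's offline rule for unweighted paging: on a miss with a full cache, evict the page whose next request lies farthest in the future. It is not the paper's learning-augmented algorithm; the function name `belady_misses` and the list-of-page-ids input format are assumptions made for illustration.

```python
def belady_misses(requests, k):
    """Count misses of Belady's farthest-in-future rule with cache size k."""
    n = len(requests)
    # next_use[i] = index of the next request for requests[i] after
    # position i, or infinity if the page is never requested again.
    next_use = [float("inf")] * n
    last_seen = {}
    for i in range(n - 1, -1, -1):
        next_use[i] = last_seen.get(requests[i], float("inf"))
        last_seen[requests[i]] = i

    cache = {}  # page -> index of its next request
    misses = 0
    for i, page in enumerate(requests):
        if page not in cache:
            misses += 1
            if len(cache) >= k:
                # Evict the cached page requested farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        # Record when this page will be needed next (on hits and misses).
        cache[page] = next_use[i]
    return misses
```

In the semi-online model described above, the exact `next_use` values would be replaced by possibly erroneous predictions; this sketch assumes they are perfect.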

    Caching with Reserves

    Caching is among the most well-studied topics in algorithm design, in part because it is such a fundamental component of many computer systems. Much of traditional caching research studies cache management for a single-user or single-processor environment. In this paper, we propose two related generalizations of the classical caching problem that capture issues that arise in a multi-user or multi-processor environment. In the caching with reserves problem, a caching algorithm is required to maintain at least k_i pages belonging to user i in the cache at any time, for some given reserve capacities k_i. In the public-private caching problem, the cache of total size k is partitioned into subcaches, a private cache of size k_i for each user i and a shared public cache usable by any user. In both of these models, as in the classical caching framework, the objective of the algorithm is to dynamically maintain the cache so as to minimize the total number of cache misses. We show that caching with reserves and public-private caching models are equivalent up to constant factors, and thus focus on the former. Unlike classical caching, both of these models turn out to be NP-hard even in the offline setting, where the page sequence is known in advance. For the offline setting, we design a 2-approximation algorithm, whose analysis carefully keeps track of a potential function to bound the cost. In the online setting, we first design an O(ln k)-competitive fractional algorithm using the primal-dual framework, and then show how to convert it online to a randomized integral algorithm with the same guarantee

    Efficient caching with reserves via marking

    Online caching is among the most fundamental and well-studied problems in the area of online algorithms. Innovative algorithmic ideas and analysis – including potential functions and primal-dual techniques – give insight into this still-growing area. Here, we introduce a new analysis technique that first uses a potential function to upper bound the cost of an online algorithm and then pairs that with a new dual-fitting strategy to lower bound the cost of an offline optimal algorithm. We apply these techniques to the Caching with Reserves problem recently introduced by Ibrahimpur et al. [10] and give an O(log k)-competitive fractional online algorithm via a marking strategy, where k denotes the size of the cache. We also design a new online rounding algorithm that runs in polynomial time to obtain an O(log k)-competitive randomized integral algorithm. Additionally, we provide a new, simple proof for randomized marking for the classical unweighted paging problem
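
As background for the last sentence of this abstract, here is a minimal sketch of the classical randomized marking algorithm for unweighted paging. It does not handle reserves and is not the paper's new algorithm or proof. The function name, the `seed` parameter, and the assumption that page ids are mutually comparable (so they can be sorted for reproducible random choices) are illustrative choices.

```python
import random


def randomized_marking_misses(requests, k, seed=0):
    """Count misses of randomized marking with cache size k."""
    rng = random.Random(seed)
    cache = set()
    marked = set()
    misses = 0
    for page in requests:
        if page not in cache:
            misses += 1
            if len(cache) >= k:
                unmarked = sorted(cache - marked)
                if not unmarked:
                    # Every cached page is marked: a new phase begins,
                    # so all marks are cleared.
                    marked.clear()
                    unmarked = sorted(cache)
                # Evict a uniformly random unmarked page.
                cache.remove(rng.choice(unmarked))
            cache.add(page)
        # A requested page is marked until the current phase ends.
        marked.add(page)
    return misses
```

Marked pages are never evicted within a phase, which is the property the standard competitive analysis of marking relies on.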

    SLaM: Student-Label Mixing for Distillation with Unlabeled Examples

    Knowledge distillation with unlabeled examples is a powerful training paradigm for generating compact and lightweight student models in applications where the amount of labeled data is limited but one has access to a large pool of unlabeled data. In this setting, a large teacher model generates "soft" pseudo-labels for the unlabeled dataset which are then used for training the student model. Despite its success in a wide variety of applications, a shortcoming of this approach is that the teacher's pseudo-labels are often noisy, leading to impaired student performance. In this paper, we present a principled method for knowledge distillation with unlabeled examples that we call Student-Label Mixing (SLaM) and we show that it consistently improves over prior approaches by evaluating it on several standard benchmarks. Finally, we show that SLaM comes with theoretical guarantees; along the way we give an algorithm improving the best-known sample complexity for learning halfspaces with margin under random classification noise, and provide the first convergence analysis for so-called "forward loss-adjustment" methods.

    Scheduling with Communication Delay in Near-Linear Time

    We consider the problem of efficiently scheduling jobs with precedence constraints on a set of identical machines in the presence of a uniform communication delay. Such precedence-constrained jobs can be modeled as a directed acyclic graph, G = (V, E). In this setting, if two precedence-constrained jobs u and v, with v dependent on u (u ≺ v), are scheduled on different machines, then v must start at least ρ time units after u completes. The scheduling objective is to minimize makespan, i.e. the total time from when the first job starts to when the last job finishes. The focus of this paper is to provide an efficient approximation algorithm with near-linear running time. We build on the algorithm of Lepere and Rapine [STACS 2002] for this problem to give an O((ln ρ)/(ln ln ρ))-approximation algorithm that runs in Õ(|V|+|E|) time.

    Indexing boolean expressions.

    We consider the problem of efficiently indexing Disjunctive Normal Form (DNF) and Conjunctive Normal Form (CNF) Boolean expressions over a high-dimensional multi-valued attribute space. The goal is to rapidly find the set of Boolean expressions that evaluate to true for a given assignment of values to attributes. A solution to this problem has applications in online advertising (where a Boolean expression represents an advertiser's user targeting requirements, and an assignment of values to attributes represents the characteristics of a user visiting an online page) and in general any publish/subscribe system (where a Boolean expression represents a subscription, and an assignment of values to attributes represents an event). All existing solutions that we are aware of can only index a specialized subset of conjunctive and/or disjunctive expressions, and cannot efficiently handle general DNF and CNF expressions (including NOTs) over multi-valued attributes. In this paper, we present a novel solution based on the inverted list data structure that enables us to index arbitrarily complex DNF and CNF Boolean expressions over multi-valued attributes. An interesting aspect of our solution is that, by virtue of leveraging inverted lists traditionally used for ranked information retrieval, we can efficiently return the top-N matching Boolean expressions. This capability enables emerging applications such as ranked publish/subscribe systems.
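
To make the inverted-list idea concrete, here is a heavily simplified sketch that indexes only conjunctions of "attribute IN {values}" predicates and matches them by counting posting-list hits. It does not cover NOT predicates, general DNF/CNF, or top-N ranking, and it should not be read as the paper's algorithm. The names `build_index` and `match` and the dictionary-based input format are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(conjunctions):
    """Build posting lists keyed by (attribute, value).

    conjunctions: {expr_id: {attribute: set_of_allowed_values}}
    Returns the posting lists and the predicate count per expression.
    """
    postings = defaultdict(list)
    sizes = {}
    for expr_id, predicates in conjunctions.items():
        sizes[expr_id] = len(predicates)
        for attr, values in predicates.items():
            for value in values:
                postings[(attr, value)].append(expr_id)
    return postings, sizes


def match(postings, sizes, assignment):
    """Return ids of conjunctions satisfied by a single-valued assignment."""
    hits = defaultdict(int)
    for key in assignment.items():
        for expr_id in postings.get(key, ()):
            hits[expr_id] += 1
    # A conjunction matches when every one of its predicates was hit.
    return sorted(e for e, count in hits.items() if count == sizes[e])
```

For example, with the hypothetical expressions below, a user with `age = 3` in `state = "NY"` matches `e1` and `e2` but not `e3`. Because each attribute takes one value in the assignment, each predicate is counted at most once.

```python
exprs = {
    "e1": {"age": {3}, "state": {"NY"}},
    "e2": {"age": {3, 4}},
    "e3": {"state": {"CA"}},
}
postings, sizes = build_index(exprs)
matched = match(postings, sizes, {"age": 3, "state": "NY"})
```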

    Analysis and measurment on RFID tagging of video projectors

    Projectiondesign AS develops and customizes projectors for various markets, such as professional simulation and visualization, medical imaging, home cinema, business use, and other demanding areas. Projectiondesign is a Norwegian company based in Fredrikstad, where the projectors are developed and manufactured. From there, the products are shipped to a worldwide market. The existing product marking consists of a product number and a serial number with an associated barcode, applied to both the product and its packaging. The aim was to investigate the possibilities of implementing RFID technology on Projectiondesign's products, to see what advantages and opportunities this could provide. An RFID system as a whole consists of a transponder attached to the object to be identified, and a reader unit that can register transponders within the reader's range. Communication between reader and transponder takes place wirelessly via electric, magnetic, or electromagnetic coupling. Although the technology has been in commercial use for over 20 years, only in recent years has it become more competitive and gained market share. This is due to cost-effective production methods, new and better standards, and wider adoption of RFID technology. On request, a fully packaged product was received from Projectiondesign. The product turned out to have a metal chassis, and because of the way an antenna behaves near a ground plane, tagging the product with RFID proved challenging. Mounting additional transponders on the packaging compensated for the limited range of RFID near metallic materials. Based on a study of RFID theory and the RFID equipment available, it was decided to investigate passive UHF technology of type <>, which is a widespread and cost-effective technology with global compatibility. Transponders of type <> [Appendix B] and <> [Appendix E] were chosen for mounting on the product and the packaging, respectively. The measured range of the transponder mounted on the projector is compared with the radiation characteristics of the antenna, which were measured for an antenna model of the transponder. The range of the transponder mounted on the packaging was measured in order to compensate for the limited read distance of the transponder mounted on the projector, and is not compared with analytical measurements.